
29 research outputs found

Mining clinical relationships from patient narratives

Author: A Rector
A Roberts
Angus Roberts
C Blaschke
C Friedman
C Giuliano
C Grover
C Nédellec
CB Ahlers
D Klein
D Lindberg
D Zelenko
Defense Advanced Research Projects Agency
G Doddington
G Zhou
H Cunningham
H Harkema
J Pustejovsky
J Thomas
K Fundel
M Goadrich
Mark Hepple
N Chinchor
N Sager
P Zweigenbaum
R Bunescu
R Gaizauskas
RC Bunescu
Robert Gaizauskas
S Katrenko
S Miller
S Pakhomov
T Rindflesch
T Wang
TC Rindflesch
U Hahn
W Chapman
Y Li
Y Lussier
Yikun Guo
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: The Clinical E-Science Framework (CLEF) project has built a system to extract clinically significant information from the textual component of medical records in order to support clinical research, evidence-based healthcare and genotype-meets-phenotype informatics. One part of this system is the identification of relationships between clinically important entities in the text. Typical approaches to relationship extraction in this domain have used full parses, domain-specific grammars, and large knowledge bases encoding domain knowledge. In other areas of biomedical NLP, statistical machine learning (ML) approaches are now routinely applied to relationship extraction. We report on the novel application of these statistical techniques to the extraction of clinical relationships.

Results: We have designed and implemented an ML-based system for relation extraction, using support vector machines, and trained and tested it on a corpus of oncology narratives hand-annotated with clinically important relationships. Over a class of seven relation types, the system achieves an average F1 score of 72%, only slightly behind an indicative measure of human inter-annotator agreement on the same task. We investigate the effectiveness of different features for this task, how extraction performance varies between inter- and intra-sentential relationships, and examine the amount of training data needed to learn various relationships.

Conclusion: We have shown that it is possible to extract important clinical relationships from text, using supervised statistical ML techniques, at levels of accuracy approaching those of human annotators. Given the importance of relation extraction as an enabling technology for text mining, and given also the ready adaptability of systems based on our supervised learning approach to other clinical relationship extraction tasks, this result has significance for clinical text mining more generally, though further work to confirm our encouraging results should be carried out on a larger sample of narratives and relationship types.
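The abstract above reports an average F1 score over seven relation types. As a reminder of what that metric measures, the following is a minimal sketch in Python of F1 as the harmonic mean of precision and recall, plus an unweighted (macro) average over relation types. The listing does not say how the paper averaged its scores, so the macro average is an assumption, and the function names, relation names and counts are hypothetical illustrations, not values from the paper.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Return F1, the harmonic mean of precision and recall (0.0 if undefined)."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if true_positives == 0 or predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    return 2 * precision * recall / (precision + recall)


def macro_f1(counts_by_relation):
    """Unweighted mean of per-relation F1 scores (an assumed averaging scheme).

    counts_by_relation maps a relation name to a (tp, fp, fn) tuple.
    """
    scores = [f1_score(*counts) for counts in counts_by_relation.values()]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical counts for two made-up relation types; not taken from the paper.
example_counts = {
    "relation_a": (8, 2, 2),
    "relation_b": (6, 4, 2),
}
print(round(macro_f1(example_counts), 3))
```

A micro average, which pools the counts across relation types before computing a single F1, would weight frequent relations more heavily and can give a different figure.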

Crossref

Springer - Publisher Connector

PubMed Central

White Rose Research Online

The Genia Event and Protein Coreference tasks of the BioNLP Shared Task 2011

Author: A Casillas
A Vlachos
Akinori Yonezawa
C Quirk
D McClosky
D Tuggener
E Emadzadeh
H Kilicoglu
H Liu
H Poon
J Björne
JD Kim
Jin-Dong Kim
Jun'ichi Tsujii
KB Cohen
L Hirschman
M Miwa
N Chinchor
N Nguyen
Ngan Nguyen
NL Nguyen
Q Le Minh
QC Bui
S Riedel
Toshihisa Takagi
Y Kim
Yue Wang
Publication venue: BioMed Central

Crossref

PubMed Central

Overview of the ID, EPI and REL tasks of BioNLP Shared Task 2011

Author: A Morgan
A Riggs
A Vlachos
A Yeh
Bruno Sobral
C Arighi
C Nédellec
C Quirk
C Wang
CH Wei
CH Wu
Chunhong Mao
Chunxia Wang
D Barford
D McClosky
D Rebholz-Schuhmann
D Tikk
Dan Sullivan
DD Sleator
E Buyko
E Charniak
ES Witze
EW Noreen
H Kilicoglu
H Lee
H Liu
H Poon
J Björne
J Hakenberg
J Stock
J Tsujii
J Wermter
J Wilbur
JD Kim
Jun'ichi Tsujii
K Yoshikawa
L Hirschman
L McGrath
L Tanabe
M Ashburner
M Gerner
M Glickman
M Krallinger
M Miwa
M Narayanaswamy
M Ongenaert
M Porter
MC de Marneffe
ME Winston
MS Simpson
N Chinchor
N Nguyen
O Bodenreider
P Corbett
P Stenetorp
P Thomason
P Thompson
P Zweigenbaum
Q Le Minh
R Farkas
R Hoehndorf
R Holliday
R Jaenisch
R Leaman
Rafal Rak
S Ananiadou
S Pyysalo
S Riedel
S Strassel
S Van Landeghem
Sampo Pyysalo
Sophia Ananiadou
T Krell
T Mascher
T Ohta
Tomoko Ohta
V Vincze
W Hersh
X Yuan
Y Gotoh
Y Sasaki
Y Tateisi
Y Wang
ZZ Hu
Publication venue: BioMed Central
Publication date: 01/01/2012

We present the preparation, resources, results and analysis of three tasks of the BioNLP Shared Task 2011: the main tasks on Infectious Diseases (ID) and Epigenetics and Post-translational Modifications (EPI), and the supporting task on Entity Relations (REL). The two main tasks represent extensions of the event extraction model introduced in the BioNLP Shared Task 2009 (ST'09) to two new areas of biomedical scientific literature, each motivated by the needs of specific biocuration tasks. The ID task concerns the molecular mechanisms of infection, virulence and resistance, focusing in particular on the functions of a class of signaling systems that are ubiquitous in bacteria. The EPI task is dedicated to the extraction of statements regarding chemical modifications of DNA and proteins, with particular emphasis on changes relating to the epigenetic control of gene expression. By contrast to these two application-oriented main tasks, the REL task seeks to support extraction in general by separating challenges relating to part-of relations into a subproblem that can be addressed by independent systems. Seven groups participated in each of the two main tasks and four groups in the supporting task. The participating systems indicated advances in the capability of event extraction methods and demonstrated generalization in many aspects: from abstracts to full texts, from previously considered subdomains to new ones, and from the ST'09 extraction targets to other entities and events. The highest performance achieved in the supporting task REL, 58% F-score, is broadly comparable with levels reported for other relation extraction tasks. For the ID task, the highest-performing system achieved 56% F-score, comparable to the state-of-the-art performance at the established ST'09 task. In the EPI task, the best result was 53% F-score for the full set of extraction targets and 69% F-score for a reduced set of core extraction targets, approaching a level of performance sufficient for user-facing applications. In this study, we extend previously reported results and perform further analyses of the outputs of the participating systems. We place specific emphasis on aspects of system performance relating to real-world applicability, considering alternate evaluation metrics and performing additional manual analysis of system outputs. We further demonstrate that the strengths of extraction systems can be combined to improve on the performance achieved by any system in isolation. The manually annotated corpora, supporting resources, and evaluation tools for all tasks are available from http://www.bionlp-st.org and the tasks continue as open challenges for all interested parties.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

The University of Manchester - Institutional Repository

Information retrieval and text mining technologies for chemistry

Author: Abacha A. B.
Alberts D.
Alfonso Valencia
American Chemical Society
Anália Lourenço
Aphinyanaphongs Y.
Appelt D. E.
Aramaki E.
Aronson A. R.
Asahara M.
Babych B.
Baeza-Yates R.
Bambenek J.
Barnard J. M.
Bast H.
Batista-Navarro R.
Batista-Navarro R. T.
Bian J.
Bies A.
Bikel D. M.
Blaschke C.
Brecher J. S.
Brill E.
Bunescu R.
Bunescu R. C.
Califf M. E.
Carpenter B.
Caruana R.
Chee B. W.
Chhieng D.
Chinchor N.
Chiticariu L.
Chowdhury M. F. M.
Ciravegna F.
Cleverdon C. W.
Coden A.
Cohen R.
Collier N.
Corbett P.
Cover T. M.
Craven M.
Cummings M. D.
Currano J. N.
Cutting D. R.
Davis C. H.
Dieb T. M.
Dogan R. I.
Downs G. M.
Dunikowski L. G.
Embarek M.
Eom J.-H.
Faber J.
Fall C. J.
Fattore M.
Fennell R. W.
Freund Y.
Fujiyoshi A.
Fukuda K.
Gale W. A.
Garcelon N.
Garnier J.-P.
Garten Y.
Ginn R.
Giuliano C.
Gold S.
Grefenstette G.
Grishman R.
Gurulingappa H.
Gusfield D.
He Y.
Hearst M. A.
Hersh W.
Hirschman L.
Hobbs J. R.
Hodge G. M.
Holzinger A.
Hsueh P.-Y.
Huber T.
Iyer S. V
Jackson P.
Joachims T.
Johnson D.
Jonnalagadda S.
Julen Oyarzabal
Jurafsky D.
Kaewphan S.
Karkaletsis V.
Katragadda S.
Kazama J.
Kazawa H.
Kelly L.
Kenny P. W.
Kim J.-D.
Kim Y.
Kleene S. C.
Kolárik C.
Kongburan W.
Kornai A.
Kraaij W.
Krallinger M.
Kremer G.
Kreuzthaler M.
Kucera H.
Lai H.
Lawson A. J.
Leaman R.
Lee C.-H.
Levenshtein V. I.
Levin M. A.
Li J.
Li N.
Li Y.
Liu X.
Locke W. N.
Lovins J. B.
Lowe D. M.
Lupu M.
Mackenzie C. E.
Manning C. D.
Mansouri A.
Martin E.
Martin Krallinger
Mattmann C.
Maynard D.
McCallum A.
McEwen L.
McKnight L.
McNaught A.
Meystre S. M.
Michalski S. R.
Michie D.
Mihalcea R.
Mitton R.
Miwa M.
Mollá D.
Murray-Rust P.
Müller B.
Nebel A.
Nikfarjam A.
Névéol A.
Obdulia Rabal
Pang B.
Panico R.
Perez-Iratxeta C.
Ponomareva N.
Ratinov L.
Ratnaparkhi A.
Read J.
Rebholz-Schuhmann D.
Reeker L. H.
Rocchio J. J.
Rohbeck H.-G.
Rosario B.
Roth D. L.
Rupp C. J.
Sagae K.
Salim N.
Salton G.
Sanchez-Cisneros D.
Saracevic T.
Sasaki Y.
Schapire R. E.
Schenck R.
Schenck R. J.
Schlaf A.
Schuemie M. J.
Segura Bedmar I.
Segura-Bedmar I.
Sekine S.
Sequeira E.
Settles B.
Sewell W.
Shen D.
Shidha M. V
Singhal A.
Smith E. G.
Stamatatos E.
Sutton C.
Sætre R.
Taylor K. T.
Tharatipyakul A.
Tomanek K.
Tsuruoka Y.
Täger W.
Urbain J.
van Rijsbergen C. J.
Vapnik V. N.
Vasserman A.
Visweswaran S.
Voorhees E. M.
Wang W.
Wang Y.
Wei C.-H.
Wermter J.
Wilbur W. J.
Willett P.
Williams A. J.
Witten I. H.
Workman M. L.
Wrublewski D. T.
Xu R.
Xue N.
Yan S.
Yang C.
Yang C. C.
Yang Y.
Zass E.
Zipf G. K.
Zitnik S.
Publication venue: 'American Chemical Society (ACS)'
Publication date: 01/01/2017

Efficient access to chemical information contained in scientific literature, patents, technical reports, or the web is a pressing need shared by researchers and patent attorneys from different chemical disciplines. Retrieval of important chemical information in most cases starts with finding relevant documents for a particular chemical compound or family. Targeted retrieval of chemical documents is closely connected to the automatic recognition of chemical entities in the text, which commonly involves the extraction of the entire list of chemicals mentioned in a document, including any associated information. In this Review, we provide a comprehensive and in-depth description of fundamental concepts, technical implementations, and current technologies for meeting these information demands. A strong focus is placed on community challenges addressing systems performance, more particularly the CHEMDNER and CHEMDNER patents tasks of BioCreative IV and V, respectively. Considering the growing interest in the construction of automatically annotated chemical knowledge bases that integrate chemical information and biological data, cheminformatics approaches for mapping the extracted chemical names into chemical structures and their subsequent annotation, together with text mining applications for linking chemistry with biological information, are also presented. Finally, future trends and current challenges are highlighted as a roadmap proposal for research in this emerging field.

A.V. and M.K. acknowledge funding from the European Community’s Horizon 2020 Program (project reference: 654021 - OpenMinted). M.K. additionally acknowledges the Encomienda MINETAD-CNIO as part of the Plan for the Advancement of Language Technology. O.R. and J.O. thank the Foundation for Applied Medical Research (FIMA), University of Navarra (Pamplona, Spain). This work was partially funded by Consellería de Cultura, Educación e Ordenación Universitaria (Xunta de Galicia), and FEDER (European Union), and the Portuguese Foundation for Science and Technology (FCT) under the scope of the strategic funding of UID/BIO/04469/2013 unit and COMPETE 2020 (POCI-01-0145-FEDER-006684). We thank Iñigo García-Yoldi for useful feedback and discussions during the preparation of the manuscript.

Universidade do Minho: RepositoriUM

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Evaluating Web search result summaries

Author: C. Sparck Jones
E.M. Voorhees
H. Borko
I. Mani
N. Chinchor
S. Afantenos
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2006

The aim of our research is to produce and assess short summaries to aid users' relevance judgements, for example for a search engine result page. In this paper we present our new metric for measuring summary quality based on representativeness and judgeability, and compare the summary quality of our system to that of Google. We discuss the basis for constructing our evaluation methodology in contrast to previous relevant open evaluations, arguing that the elements which make up an evaluation methodology (the tasks, data and metrics) are interdependent, and that the way in which they are combined is critical to the effectiveness of the methodology. The paper discusses the relationship between these three factors as implemented in our own work, as well as in SUMMAC/MUC/DUC.

CiteSeerX

Crossref

Sunderland University Institutional Repository

Multimedia Analysis + Visual Analytics = Multimedia Analytics

Author: J J Thomas
M G Christel
N A Chinchor
P C Wong
W Ribarsky
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref

Learning information extraction patterns from examples

Author: G. Miller
J. R. Hobbs
K. VanLehn
M. Pazzani
N. Chinchor
R. J. Hall
S. Soderland
W. Lehnert
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Quality control in the concept learning process

Author: D. Aha
F. Gomez
K. Moorman
L. Rau
N. Chinchor
R. MacGregor
U. Hahn
W. Woods
Publication venue: 'Springer Science and Business Media LLC'

Crossref

How features embedded in eWOM predict hotel guest satisfaction: an application of artificial neural networks

Author: Chinchor N.
Coelho A.
Hansen S. S.
Ling C. X.
Pei Y. L.
Webb G. I.
Witten I. H.
Publication venue: 'Informa UK Limited'

Crossref

Natural language processing for information retrieval

Author: Chinchor N.
David D. Lewis
Fagan J.L.
Harman D.K. Ed.
Karen Spärck Jones
Parsaye K.
SparckJones K.
Willett P. Ed.
Young S.R.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref